Prediction and Inference with Incomplete Probabilistic
Knowledge 

AFOSR Grant #F49620-96-1-0307
Final  Report 

Status  of  Effort  and  Background  Description 

A principal goal of modern reliability analysis is to design methods
for  appraising  the  reliability  of  large,  logically  complex  systems  that  must 
operate  at  exceptionally  high  levels  of  reliability,  but  are  subject  to  failure  at 
uncertain  times  in  uncertain  ways.  Nuclear  power  plants,  the  space  shuttle, 
DARPA net and fly-by-wire aircraft are examples of such systems.

Current  practice  is  to  design  a  probabilistic  model  of  the  stochastic 
behavior  of  a  complex  system  and  to  augment  it  with  numerical  appraisal  of 
the  most  important  properties  of  the  joint  probability  law  that  the  model 
entails. A very different approach is to model the logical structure of a
system so as to identify all logically obtainable events, without modeling
exactly the probability law that governs event occurrence. In place of assigning
a specific probability law to a sample space that contains all obtainable
system events, one asks system analysts to assign numerical probabilities to some
events that lie within their domain of expertise. This recasts the problem and
changes the focus of the computational task.

For  a  fully  specified  model,  the  task  is: 




Given  initial  conditions,  a  joint  probability  law  for  events  and  either 
certain  or  probabilistic  knowledge  of  model  parameters,  calculate  the 
probability of occurrence of one or more critical events.

The  task  for  the  alternative  is: 

Given  the  logical  structure  of  obtainable  events  and  some  numerical 
assessments  of  probabilities  of  these  events,  compute  bounds  on  the 
probabilities of one or more critical events that are not directly
assessed, without attempting to specify a joint probability law for all
events.

A  sound  method  for  processing  complex  system  probability  assessments 
should: 

□  Allow  determination  of  the  coherence  or  incoherence  of  assessments, 

□  Enable  computation  of  coherent  bounds  on  probabilities  of  events  not 
directly  assessed, 

□  Allow  for  efficient  revision  of  bounds  in  light  of  additional  information 
in  the  form  of  expert  judgment  or  in  the  form  of  observation  of  the 
occurrence of an event or its complement. It must also allow for efficient
revision  when  additional  assessments  are  provided. 

□ Not contradict Bayesian conditionalization,

□  Be  computationally  tractable  for  realistic  problems  of  moderate  to  large 
size,  and 

□  Be  based  on  reasonable  assumptions  about  qualitative  features  of  the 
probability  law  governing  uncertain  system  events. 

De Finetti’s Fundamental Theorem of Probability (FTP) provides both a
conceptual  and  computational  framework  for  carrying  out  the  second  of  the 
above  programs  for  calculating  probabilities  of  critical  reliability  events.  In 
addition,  the  FTP  meets  all  of  the  conditions  for  a  sound  method  just 
described.  It  is  the  foundation  for  a  program  of  analysis  of  the  reliability  of 
complex  systems  that  exploits  and  extends  recent  research  at  the  intersection 
of  probabilistic  logic  and  mathematical  programming. 

Until recently, the FTP was primarily of conceptual value because it
prescribes linear and/or non-linear programming problems with
exponentially many decision variables, problems that cannot be solved
using conventional linear programming algorithms. Numerical assessments
of probabilities for N events may lead to a linear program with $O(2^N)$
decision variables which, when N is large, is not directly solvable by the
simplex method or any other direct method.
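
To make the size of the problem concrete, here is a minimal sketch (not the software developed under this grant) of the FTP linear program for a toy case. It assumes N = 2 events A and B with hypothetical assessments P(A) = 0.7 and P(B) = 0.6, and asks for coherent bounds on P(A and B). The decision variables are the probabilities of the 2^N atoms. Because the feasible set is a bounded polytope, both bounds are attained at basic feasible solutions, which the sketch enumerates with exact rational arithmetic. The names ftp_bounds and solve_square are illustrative, and the enumeration assumes the constraints are linearly independent.

```python
from fractions import Fraction
from itertools import combinations, product


def solve_square(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination; None if A is singular."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def ftp_bounds(n_events, assessments, target):
    """Coherent (lower, upper) bounds on P(target), or None if incoherent.

    assessments: list of (event, probability), where event maps an atom
    (a tuple of 0/1 truth values) to True/False. Assumes the constraint
    rows are linearly independent (hypothetical simplification).
    """
    atoms = list(product([0, 1], repeat=n_events))  # 2**n_events atoms
    rows = [[Fraction(1)] * len(atoms)]             # probabilities sum to 1
    rhs = [Fraction(1)]
    for event, p in assessments:
        rows.append([Fraction(int(event(a))) for a in atoms])
        rhs.append(Fraction(p))
    cost = [Fraction(int(target(a))) for a in atoms]

    values = []
    for basis in combinations(range(len(atoms)), len(rows)):
        sub = [[row[j] for j in basis] for row in rows]
        x = solve_square(sub, rhs)
        if x is not None and all(v >= 0 for v in x):  # basic feasible solution
            values.append(sum(cost[j] * v for j, v in zip(basis, x)))
    return (min(values), max(values)) if values else None


lo, hi = ftp_bounds(
    2,
    [(lambda a: a[0] == 1, Fraction(7, 10)),   # P(A) = 0.7
     (lambda a: a[1] == 1, Fraction(6, 10))],  # P(B) = 0.6
    lambda a: a == (1, 1),                     # target: A and B
)
print(lo, hi)  # 3/10 3/5
```

The number of candidate bases grows combinatorially with 2^N, so this brute-force enumeration is practical only for very small N. A None result signals that the assessments are incoherent, that is, no probability distribution over the atoms satisfies them.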

Accomplishments 

Our  research  moves  on  three  tracks:  development  of  algorithms  and 
software  to  support  solutions  to  FTP  type  problems,  investigation  of 
applications of the FTP to realistic problems in reliability, and extension of
theory to allow incorporation of assertions about qualitative probabilistic
structure.

Algorithms 

We  have  constructed  an  automated  method  for  translating  system 
logic  into  algebraic  inequalities  and  coupled  this  method  with  a  column 
generation  algorithm,  the  Related  Integer  Program  [RIP],  which  enables  us 
to  solve  problems  of  moderate  size  (N=150)  with  complex  logical  structure. 
We prove the existence of an easily computable lower bound on the optimal
value of the “Master” LP problem from properties of the RIP, which makes it easy
to control the tolerance, or deviation from exact optimality. This algorithm takes
advantage of efficient column generation methods (see Chandru and Hooker
(1999, Chapter 4) for a review of column generation methods in this setting).

Jeremy Cohen, a Master's degree candidate in Electrical Engineering,
completed his M. Eng. thesis, Implementation and Application of the Fundamental
Theorem of Probability, this June. Jeremy’s thesis advances the
computational capacity of the RIP algorithm and begins investigation of the
fundamental problem of incorporating an expert’s specification of
qualitative probabilistic structure into the de Finetti framework (see
below).

Fernando Ordonnez, a PhD candidate in Operations Research, worked
with  Professor  Rob  Freund  and  me  on  improvements  in  algorithms,  on 
theoretical  extensions  of  the  FTP  and  on  application  of  new  results  to  the 
U.S.  Nuclear  Regulatory  Commission  Surry  PRA  Model. 

Theoretical  Extensions 
Statements about the qualitative structure of uncertain quantities
embrace  conditional  independence  and  dependence  relations,  symmetry 
relations (such as exchangeability) and stochastic order relations. Some of
these relations are easily translated into linear algebraic equalities and
inequalities, which then permits application of the FTP in its simplest form.
More  general  types  of  conditional  dependence  relations  lead  to  inequalities 
among  multilinear  forms.  Logicians  working  at  the  interface  of 
mathematical  programming  and  logic  have  suggested  use  of  non-linear 
programming  methods  to  deal  with  the  non-linear  problem  that  arises  from 
coupling  numerical  constraints  on  probabilities  with  specification  of 
qualitative probabilistic structure. (See Chandru and Hooker, op. cit., for a
discussion.)  These  methods  are  cumbersome  and  computationally  difficult. 

Bayesian networks and Markov, time-varying and semi-Markov chains
are examples of probabilistic systems for which an expert might provide
both numerical appraisal of some marginal and conditional probabilities and
information about qualitative probabilistic dependence relations
governing system events, without fully specifying a joint probability law for
all logically realizable events.

Giovanni Andreatta, Professor of Operations Research at the University of
Padova, and I have designed two algorithms that allow application of
efficient column generation techniques designed specifically for LPs even
when the master problem possesses a non-linear objective function and
non-linear constraints. The ability to do this rests on the following
propositions (see Qualitative Probabilistic Structure and de Finetti's
Fundamental Theorem of Probability):

□  Any  assertion  about  the  qualitative  probabilistic  structure  of  a  finite 
number  of  dichotomous  uncertain  quantities  can  be  represented  directly 
as  equalities  or  inequalities  among  elements  of  the  vector  of  FTP 
programming decision variables or as equalities or inequalities among
ratios  or  among  ratios  of  sums  of  these  decision  variables. 

□ A conditional dependence relation among N dichotomous uncertain
quantities can be represented as a system of linear equations in the $2^N$ or
fewer probabilities of logically possible joint events. This system of
equations is indexed by a parameter that assumes values in [0,1] or some
subset of it.

1 Work on this topic has proceeded past the term of support of July 1996 to September 1998.


These  propositions  enable  us  to  represent  assertions  about  qualitative 
probabilistic  structure  in  the  form  of  linear  constraints  with  continuous 
parameters. We avoid the necessity of dealing with constraints among
multilinear forms and recast the general problem as a linear programming
problem  with  a  finite  number  of  (continuous)  parametric  constraints. 
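
One way to see the idea in the simplest case (an illustration only; the symbols q_ab and t are introduced here and are not taken from the cited paper) is to take N = 2 dichotomous quantities A and B and write q_ab for the probabilities of the four joint events. Independence of A and B is a bilinear condition in q, but once the marginal P(A) is treated as a parameter t in [0,1] it becomes a pair of linear equations:

```latex
% Atom probabilities: q_{ab} = P(A = a, B = b), with a, b \in \{0, 1\}.
\begin{align*}
\text{Independence (bilinear in } q\text{):}\quad
  & q_{11} = (q_{10} + q_{11})\,(q_{01} + q_{11}) \\
\text{Fix } t = P(A) \in [0,1]\text{; then linear in } q\text{:}\quad
  & q_{10} + q_{11} = t \\
  & q_{11} - t\,(q_{01} + q_{11}) = 0 \\
  & q_{00} + q_{01} + q_{10} + q_{11} = 1, \qquad q_{ab} \ge 0
\end{align*}
```

For each fixed t the constraints are linear in q, and a vector q satisfies the bilinear condition exactly when it satisfies the linear system for some t in [0,1].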

These new algorithms perform well on the moderate-sized problems tested
thus far. They have not yet been tested on large problems (N>100).

Applications 

The  current  version  of  the  RIP  algorithm  has  been  tested  on  several 
nuclear  fault  tree  examples  of  moderate  size  with  complex  logical  structure 
induced  by  common  mode  failure  types.  [1] 

We  extend  the  scope  of  applications  to  include  Bayesian  Networks 
and  Fault  Detection  Systems  that  have  moderately  complex  logical  structure. 
Re-analysis  of  some  Bayesian  Network  problems  posed  by  Pearl  and  others 
will  allow  comparison  of  alternative  methods  of  analysis.  [2] 

In  particular,  we  address  questions  such  as,  “If  experts  provide  a 
Bayes  Network  describing  the  qualitative  structure  of  a  probabilistic  system, 
but  do  not  appraise  enough  numerical  probabilities  to  allow  direct 
computation  of  a  target  event,  what  does  the  FTP  tell  us  about  upper  and 
lower  bounds  on  the  probability  of  this  event  given  incomplete  probabilistic 
information?”  The  FTP,  coupled  with  the  algorithms  cited  above,  provides  a 
logically  sound  alternative  to  backward  and  forward  propagation  schemes 
suggested  in  the  literature. 

Other  Research 

In  addition  to  working  in  the  domain  of  reasoning  with  incomplete 
probabilistic  knowledge,  I  have  done  research  in  two  other  domains. 

First, I have established the exact distribution of scaled multivariate
Normal  residuals  and  the  corresponding  exact  distributions  of  some  related 
multivariate statistics. [3] Let $x_1, \ldots, x_n$ be a realization of n ($p \times 1$) random
vectors (rvs), m be their mean and S be the unscaled sample covariance
matrix computed via $x_1, \ldots, x_n$. I prove that the density of a sample
standardized version $S^{-1/2}[x_j - m]$ of a generic $x_j$ possesses a simple,
spherically symmetric functional form and generalize this result to matric
versions of scaled residuals. This result establishes the exact distribution of a
matric analogue of univariate scaled residuals. The behavior of scaled
residuals  has  long  been  a  benchmark  for  testing  univariate  and  multivariate 
normality,  so  I  believe  this  to  be  a  very  useful  result.  I  am  circulating  the 
working paper to discover connections in the multivariate literature. Thus far
I  have  found  no  papers  that  establish  this  distribution. 

Second, a colleague pointed out that recent research and practice in
portfolio analysis go beyond the traditional (Markowitz-Sharpe)
minimization  of  variance  subject  to  meeting  budget  constraints  and 
achieving  at  least  a  target  rate  of  return.  The  aim  is  to  distinguish  probability 
of  loss  from  probability  of  gain.  Morgan-Stanley’s  VALUE  AT  RISK  and 
the  BASLE  agreement  establishing  portfolio  management  standards  for 
investment  banks  are  examples.  In  response,  I  prove  that  if  our  aim  is  to  find 
a  portfolio  that  maximizes  a  specified  fractile  of  portfolio  return  subject  to 
achieving at least a target rate of return, we may be led to concentrate
rather than diversify in instances where mean-variance analysis yields
diversification. [4]
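
The effect can be seen in a deliberately simple case, which is an illustration and not the proof in [4]. Assume two independent assets with identical normally distributed returns (hypothetical mean 5% and standard deviation 20%), fully invested with weight w in the first asset. Every w gives the same expected return, so any target at or below 5% is met. Under these assumptions, mean-variance analysis picks w = 0.5, whereas maximizing an upper fractile (0.9 here, an assumed choice) puts all weight in a single asset.

```python
from math import sqrt
from statistics import NormalDist

MU = 0.05     # hypothetical expected return of each asset
SIGMA = 0.20  # hypothetical standard deviation of each asset's return


def portfolio_sd(w):
    """Standard deviation of w*R1 + (1 - w)*R2 for independent R1, R2."""
    return SIGMA * sqrt(w * w + (1.0 - w) * (1.0 - w))


def portfolio_fractile(w, alpha):
    """alpha-fractile of the normally distributed portfolio return."""
    return MU + NormalDist().inv_cdf(alpha) * portfolio_sd(w)


def best_weights(alpha, steps=100):
    """Grid search: (minimum-variance weight, fractile-maximizing weight)."""
    grid = [i / steps for i in range(steps + 1)]
    w_min_var = min(grid, key=portfolio_sd)
    w_fractile = max(grid, key=lambda w: portfolio_fractile(w, alpha))
    return w_min_var, w_fractile


w_mv, w_fr = best_weights(alpha=0.9)
print(w_mv, w_fr)  # 0.5 0.0
```

In this normal setting a fractile below the median (for example alpha = 0.1) is maximized by the diversified portfolio instead, so whether concentration arises depends on the fractile chosen and on the return distribution.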

The last piece that I did on software reliability, Successive Sampling
and Software Reliability, appeared in the Journal of Statistical Planning and
Inference [5].